Search results for "Markov decision process"
Showing 10 of 22 documents
MDP-Based Resource Allocation Scheme Towards a Vehicular Fog Computing with Energy Constraints
2018
As mobile applications deliver increasingly complex functionality, the demand for ever more intensive computation quickly exceeds the energy capabilities of mobile devices. On the one hand, to address such issues, the fog computing paradigm has been introduced to mitigate the limited energy and computation resources of constrained mobile devices by moving computation resources closer to their users, at the edge of the access network. On the other hand, most electric vehicles (EVs), with increasing computation, storage, and energy capabilities, spend more than 90% of their time in parking lots. In this paper, we conceive the basic idea of using the underutilized computation r…
Safer Reinforcement Learning for Agents in Industrial Grid-Warehousing
2020
In mission-critical, real-world environments, there is typically a low threshold for failure, which makes interaction with learning algorithms particularly challenging. Here, current state-of-the-art reinforcement learning algorithms struggle to learn optimal control policies safely. When they fail, loss of control follows, potentially resulting in equipment breakage and even personal injury.
Increasing sample efficiency in deep reinforcement learning using generative environment modelling
2020
CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning
2020
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment. Learning typically happens through trial and error, using explorative methods such as ε-greedy. Two approaches, model-based and model-free reinforcement learning, have shown concrete results in several disciplines. Model-based RL learns a model of the environment and uses it to learn the policy, while model-free approaches are fully explorative and exploitative, without considering the underlying environment dynamics. Model-free RL works conceptually well in simulated environments, and empirical evidence suggests that trial and error lead to a near-opti…
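As a concrete illustration of the ε-greedy rule the abstract mentions, here is a minimal sketch; the function name, Q-values, and action count below are invented for the example:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest Q-value (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Illustrative Q-values for four actions in some state:
q = [0.1, 0.5, 0.3, 0.2]
action = epsilon_greedy(q)  # action 1 most of the time, a random action otherwise
```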
Some Effects of Individual Learning on the Evolution of Sensors
2001
In this paper, we present an abstract model of sensor evolution in which sensor development is determined solely by artificial evolution, while the adaptation of agent reactions is accomplished by individual learning. With the environment cast into an MDP framework, sensors can be conceived as a map from environmental states to agent observations, and Reinforcement Learning algorithms can be utilised. On the basis of a simple gridworld scenario, we present some results on the interaction between individual learning and the evolution of sensors.
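To make the "sensor as a map" idea concrete, here is a minimal sketch assuming a tabular Q-learning agent; the states, observations, and actions are invented for the example and are not taken from the paper:

```python
from collections import defaultdict

# A sensor maps environment states to (possibly coarser) observations;
# here two states share each observation, so the agent cannot tell them apart.
sensor = {0: "A", 1: "A", 2: "B", 3: "B"}

actions = ["left", "right"]
Q = defaultdict(float)       # Q[(observation, action)]
alpha, gamma = 0.1, 0.9      # learning rate, discount factor

def q_update(state, action, reward, next_state):
    """One tabular Q-learning step, indexed by what the sensor reports."""
    obs, next_obs = sensor[state], sensor[next_state]
    best_next = max(Q[(next_obs, a)] for a in actions)
    Q[(obs, action)] += alpha * (reward + gamma * best_next - Q[(obs, action)])
```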
Continuous energy-efficient monitoring model for mobile ad hoc networks
2021
The monitoring of mobile ad hoc networks is an observation task that consists of analysing the operational status of these networks while evaluating their functionalities. Monitoring has become of considerable importance for keeping the whole network and its applications working properly. It must be carried out in real time through measurements, logs, configurations, etc. However, achieving continuous energy-efficient monitoring in mobile wireless networks is very challenging, given the features of the environment as well as the unpredictable behavior of the participating nodes. This paper outlines the challenges of continuous energy-efficient monitoring over mobile ad hoc n…
MDP-based Resource Allocation for Uplink Grant-free Transmissions in 5G New Radio
2020
The diversity of application scenarios in 5G mobile communication networks calls for innovative initial-access schemes beyond traditional grant-based approaches. As a novel concept for facilitating small-packet transmission and achieving ultra-low latency, grant-free communication is attracting considerable interest in the research community and standardization bodies. However, when a network consists of both grant-based and grant-free end devices, how to allocate slot resources properly between these two categories of devices remains an open question. In this paper, we propose a Markov decision process based scheme which dynamically allocates grant-free resources based on a spec…
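The snippet cuts off before the scheme is defined, so the following is only a hypothetical sketch of how such a slot-allocation MDP might be set up; the state space, action space, and reward below are assumptions, not the paper's model:

```python
SLOTS = 10                     # slots available per frame (assumed)
states = range(SLOTS + 1)      # e.g. current backlog of grant-free devices
actions = range(SLOTS + 1)     # slots granted to grant-free traffic

def reward(backlog, granted):
    """Illustrative trade-off: serve the grant-free backlog while
    leaving the remaining slots to grant-based devices."""
    served = min(backlog, granted)
    grant_based_capacity = SLOTS - granted
    return served + 0.5 * grant_based_capacity
```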
Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop
2018
Active inference is an ambitious theory that treats perception, inference, and action selection of autonomous agents under the heading of a single principle. It suggests biologically plausible explanations for many cognitive phenomena, including consciousness. In active inference, action selection is driven by an objective function that evaluates possible future actions with respect to current, inferred beliefs about the world. At its core, active inference is independent of extrinsic rewards, resulting in a high level of robustness across, e.g., different environments or agent morphologies. In the literature, paradigms that share this independence have been summarised under the notion of in…
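The snippet does not spell the objective out; in the standard active-inference literature it is commonly formalised as the expected free energy of a policy π, which action selection then minimises (stated here as background, not as this paper's contribution):

```latex
G(\pi) = \sum_{\tau} \mathbb{E}_{q(o_\tau, s_\tau \mid \pi)}
         \left[ \ln q(s_\tau \mid \pi) - \ln p(o_\tau, s_\tau \mid \pi) \right]
```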
Towards Model-Based Reinforcement Learning for Industry-Near Environments
2019
Deep reinforcement learning has, over the past few years, shown great potential in learning near-optimal control in complex simulated environments with little visible information. Rainbow (Q-Learning) and PPO (Policy Optimisation) have shown outstanding performance in a variety of tasks, including the Atari 2600, MuJoCo, and Roboschool test suites. Although these algorithms are fundamentally different, both suffer from high variance, low sample efficiency, and hyperparameter sensitivity that, in practice, make them a no-go for critical operations in industry.
Explainable Reinforcement Learning with the Tsetlin Machine
2021
The Tsetlin Machine is a recent supervised machine learning algorithm that has obtained competitive results in several benchmarks, in terms of both accuracy and resource usage. It has been used for convolution, classification, and regression, producing interpretable rules. In this paper, we introduce the first framework for reinforcement learning based on the Tsetlin Machine. We combined the value iteration algorithm with the regression Tsetlin Machine, as the value-function approximator, to investigate the feasibility of training the Tsetlin Machine through bootstrapping. Moreover, we document the robustness and accuracy of learning on several instances of the grid-world problem.
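A minimal sketch of fitted value iteration with bootstrapped targets; a linear least-squares regressor stands in for the regression Tsetlin Machine, and the toy chain MDP below is invented for the example:

```python
import numpy as np

def features(s):
    """Illustrative features for a state s of a small 1-D gridworld."""
    return np.array([1.0, s, s * s])

n_states, n_actions, gamma = 5, 2, 0.9
# Deterministic moves left/right on a chain; reward for reaching the right edge.
P = {(s, a): min(max(s + (1 if a else -1), 0), n_states - 1)
     for s in range(n_states) for a in range(n_actions)}
R = {(s, a): 1.0 if P[(s, a)] == n_states - 1 else 0.0
     for s in range(n_states) for a in range(n_actions)}

w = np.zeros(3)  # weights of the stand-in value-function regressor
for _ in range(50):
    X = np.array([features(s) for s in range(n_states)])
    # Bootstrapped targets: a one-step lookahead through the current V.
    y = np.array([max(R[(s, a)] + gamma * features(P[(s, a)]) @ w
                      for a in range(n_actions)) for s in range(n_states)])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)  # refit V(s) ≈ features(s)·w
```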